Improving Speech Recognition through Automatic Selection of Age Group - Specific Acoustic Models

نویسندگان

  • Annika Hämäläinen
  • Hugo Meinedo
  • Michael Tjalve
  • Thomas Pellegrini
  • Isabel Trancoso
  • José Miguel Salles Dias
چکیده

The acoustic models used by automatic speech recognisers are usually trained with speech collected from young to middle-aged adults. As the characteristics of speech change with age, such acoustic models tend to perform poorly on children's and elderly people’s speech. In this study, we investigate whether the automatic age group classification of speakers, together with age group –specific acoustic models, could improve automatic speech recognition performance. We train an age group classifier with an accuracy of about 95% and show that using the results of the classifier to select age group –specific acoustic models for children and the elderly leads to considerable gains in automatic speech recognition performance, as compared with using acoustic models trained with young to middle-aged adults’ speech for recognising their speech, as well.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Comparative Study of Gender and Age Classification in Speech Signals

Accurate gender classification is useful in speech and speaker recognition as well as speech emotion classification, because a better performance has been reported when separate acoustic models are employed for males and females. Gender classification is also apparent in face recognition, video summarization, human-robot interaction, etc. Although gender classification is rather mature in a...

متن کامل

Robust recognition of children's speech

Developmental changes in speech production introduce age-dependent spectral and temporal variability in the speech signal produced by children. Such variabilities pose challenges for robust automatic recognition of children’s speech. Through an analysis of age-related acoustic characteristics of children’s speech in the context of automatic speech recognition (ASR), effects such as frequency sc...

متن کامل

Speaker age estimation for elderly speech recognition in European Portuguese

Phone-like acoustic models (AMs) used in large-vocabulary automatic speech recognition (ASR) systems are usually trained with speech collected from young adult speakers. Using such models, ASR performance may decrease by about 10% absolute when transcribing elderly speech. Ageing is known to alter speech production in ways that require ASR systems to be adapted, in particular at the level of ac...

متن کامل

Allophone-based acoustic modeling for Persian phoneme recognition

Phoneme recognition is one of the fundamental phases of automatic speech recognition. Coarticulation which refers to the integration of sounds, is one of the important obstacles in phoneme recognition. In other words, each phone is influenced and changed by the characteristics of its neighbor phones, and coarticulation is responsible for most of these changes. The idea of modeling the effects o...

متن کامل

A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation

Abstract   Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by making a natural man-machine interaction. Since speech is the most popular method of communication, recognizing human emotions from speech signal becomes a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2014